1. INTRODUCTION

The era of telecommunications began in 1876, when a network was built that enabled two parties to
transmit their voices and communicate. The Internet began in 1969, funded by the Advanced Research Projects
Agency [1]. Using an Internet Protocol (IP) address, a user's request is forwarded to the server through
other nodes within the network. The reply to the request is sent back to the user along a particular path
formed by the routing process in the network. If another user requests the same data, the packet is sent
again from the server to that user. This makes packet delivery inefficient, because the packet is always
sent from a server that is far from the user. To solve this problem, the concept of the Content Distribution
Network [2] was proposed. A replica server that contains all the data held by the main server is placed at a
fixed location closer to the user. A request for certain content is then redirected to the replica server
and no longer needs to be served by an origin server that is farther away.

The replica server is updated periodically or whenever content changes on the origin server.
However, this system has difficulty supporting mobility and dynamically changing content requests from
consumers. When a consumer moves away from the replica server, the consumer may no longer be served
efficiently by that replica server. Because the Content Distribution Network is still based on the
Internet Protocol (IP), a user's request is always addressed to a particular server. Consequently, an
additional process is still needed to map the intended IP address to the server that is closest to the user.
From the beginning, the focus of a user request has been the content (content-based), but in the previous
system the request is addressed to a certain server node with a certain IP address (host-based).

In 2009, Jacobson et al. proposed a content-based network paradigm [3]. The concept had been raised
a few years earlier in research projects under the name Content-Centric Networking (CCN), originally
developed at Xerox's Palo Alto Research Center. It has since developed into the Named Data Network (NDN),
initiated by the NSF-funded Future Internet Architecture project [4]. This concept replaces the 'where'
paradigm with the concept of 'what': a consumer request is no longer addressed to a specific node but to
certain content [3], [5]-[7]. Under this paradigm, a content request can be answered not only by a
particular server but also by the nearest device that stores the requested data. To support this concept,
NDN router nodes are equipped with content storage to store the data [3]-[6], [8].

The concept of caching in the Named Data Network differs from caching in the previous system.
Each NDN node has content storage to hold data. Unlike the previous network, node mobility is supported
because the content store can be tailored to the users' demand pattern for content. When users change
position, the router has to re-customize the contents of its content store according to the requests of
users in the local area, so caching in NDN is more dynamic. The NDN architecture supports flexible network
topologies in which wireless nodes can enter and exit an area, and a node can act as a producer at one time
and as a consumer at another. An NDN node can be equipped with various cache rules, including rules that
determine which content is selected for and deleted from the content store [5], [9], [10], rules that select
the places where content is cached [10]-[13], and cache policies that define the model of cooperation
between nodes in making the caching decision [14]-[17]. Regarding node mobility, several techniques have
been studied to maintain system performance even though nodes move in and out of coverage [18]-[21].

This survey paper explains the development of caching techniques in the Named Data Network, which
is an important basis for understanding the latest NDN caching techniques and for developing better
techniques that enhance NDN performance as an efficient communication solution going forward. The
advantages and disadvantages of the techniques are explained briefly to make them easy to understand.
Finally, open challenges related to the caching mechanism for improving NDN performance are proposed.

The remainder of this paper is organized as follows. Section 2 reviews related survey work. Section 3
describes the state of the art of caching in the Named Data Network. The caching techniques are grouped
into cache placement, cache content selection, and cache policy design, and the advantages and drawbacks
of each group are explained. Techniques that support mobility are explained as well. Section 4 describes
the challenges and open issues related to caching in the Named Data Network. Finally, Section 5 presents
the conclusion of this paper.


2. RELATED WORK 

Several survey papers on caching have been published. Paper [22] emphasizes cache replacement
techniques for web services in IP-based systems. Paper [23] discusses several in-network caching
mechanisms in Information-Centric Networking from 2014 and earlier. Paper [5] discusses
information-centric mobile caching, including caching in cellular, ad hoc, and hybrid networks, and
explains the different cache locations for each scheme along with some cache mechanisms. Papers [22] and
[24] discuss cache replacement mechanisms only, including for mobile nodes. To the best of the authors'
knowledge, no survey paper so far discusses the latest caching techniques and maps them into groups based
on their basic technique. Such a mapping makes the basic techniques behind the caching mechanisms easier
to understand.

This survey paper focuses on caching techniques, including recent studies. For clarity, the caching
mechanisms are divided into four general groups: cache placement, cache content selection, cache policy
design, and caching for mobile nodes. The explanation begins with how caching in the Named Data Network
differs from caching in the previous network and why caching matters for improving Named Data Network
performance, and then maps the caching techniques according to their basic mechanism. The aim is to help
the reader understand the basis on which these caching techniques were developed. The advantages and
disadvantages of each group of techniques are presented succinctly to make them easy to understand. The
paper concludes with proposed research topics on NDN caching that are still open, so that the best
techniques to support NDN can continue to be developed and examined.


3. STATE OF THE ART OF CACHING ON NAMED DATA NETWORK
3.1. Components of the Named Data Network Router

The Named Data Network shifts the paradigm from 'where' to 'what'. The user sends a request for content
to the network, and the network then determines which node can serve the request most efficiently. The
user therefore does not need to know where the content server is. Under this paradigm, the reply to a
request does not always come from the server but can come from any node in the network. The NDN
architecture makes data communication more efficient and can significantly reduce network load.

An NDN node consists of three components: the Content Storage (CS), the Pending Interest Table
(PIT), and the Forwarding Information Base (FIB) [3], [25]. When consumer B wants data from the producer,
the consumer sends a request for the content in an Interest packet. The NDN router that receives the
request checks whether the content is in its CS. If it is, the router immediately sends the requested data
to the consumer. If the data is not in the CS, the router checks the PIT to see whether the content has
already been requested and has not yet been answered with a matching Data packet. If the PIT holds such an
entry, the entry is updated to record that consumer B also requested the same data. The information in the
PIT forms the reverse path for sending data back to the consumer. If the PIT holds no entry for the
content requested by consumer B, the router checks the FIB. The Interest packet is forwarded toward the
data provider node according to the information in the FIB. If the FIB holds no information about the
content provider's node, the NDN router discards the Interest packet. This process is described in
Figure 1.

Figure 1. Processes that occur on each NDN router when it receives the Interest Packet [25]
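
The lookup order described above (CS, then PIT, then FIB) can be illustrated with a minimal sketch. It is
not taken from [3] or [25]: the class name NdnRouter, the face labels, the prefix matching on '/'-separated
names, and the choice to cache every returning Data packet are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NdnRouter:
    """Toy router with the three tables named in the text (hypothetical class)."""
    cs: dict = field(default_factory=dict)    # content name -> data
    pit: dict = field(default_factory=dict)   # content name -> set of requesting faces
    fib: dict = field(default_factory=dict)   # name prefix -> next-hop face

    def on_interest(self, name: str, in_face: str) -> tuple:
        # 1. CS hit: the router answers the consumer directly.
        if name in self.cs:
            return ("data", in_face, self.cs[name])
        # 2. Same content already pending: record the extra requester only.
        if name in self.pit:
            self.pit[name].add(in_face)
            return ("aggregated", None, None)
        # 3. FIB lookup. Matching on name prefixes is an assumption of this sketch.
        prefix = name
        while prefix:
            if prefix in self.fib:
                self.pit[name] = {in_face}
                return ("forward", self.fib[prefix], None)
            prefix = prefix.rpartition("/")[0]
        # 4. No FIB information: the Interest packet is discarded.
        return ("drop", None, None)

    def on_data(self, name: str, data) -> list:
        # The PIT entry gives the reverse path; caching every Data packet is assumed.
        faces = self.pit.pop(name, set())
        self.cs[name] = data
        return sorted(faces)
```

In this sketch a second Interest for the same name only extends the PIT entry, and a later Interest is
answered from the CS without reaching the producer.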


3.2. Content Storage 

Content Storage (CS) is one of the important components of the NDN router node. The CS allows data to
be stored in NDN router nodes, so that a consumer's request for content does not have to be served by a
particular server but can be served by a router node that holds the content in its CS. The CS is one of
the limited resources of an NDN router. It should therefore be utilized as efficiently as possible in
order to improve NDN performance.

The size of the content store affects the delay and the number of hops that packets must traverse to
reach the consumer [26]. This in turn affects the overall network load caused by the circulation of data
in the network [4], [8]. The effect of the CS also varies with the cache policy implemented in the node
[27]. In this paper, the caching strategies are classified as cache placement, cache content selection,
cache policy design, and caching for the mobile node. Each group is described, including its advantages
and drawbacks, in sections 3.3 to 3.6.


3.3. Cache Placement 

Cache placement focuses on determining which nodes will store a data packet. For publisher/subscriber
networks, a method has been proposed that chooses the nodes to store packets based on local content
popularity and the content storage capacity of each node [28]. In NDN networks, packets are initially
placed on every node in the network so that the consumer can access the content directly from the closest
node.

Paper [11] proposes a data packet flooding mechanism in which data packets are stored in all nodes
on the best path, limited by a maximum number of hops for the spread of the packet. In paper [12], packets
are distributed for storage across network nodes while a Bloom filter is used to ensure that no redundant
packets are stored, which saves resources. The weakness of the Bloom filter technique, namely false
positives, is corrected by A. Hidayat et al. [13]. In this technique, the Bloom filter is combined with a
sequential search algorithm.
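
The idea of pairing a Bloom filter with a sequential search can be sketched as follows. This is only an
illustration of the general principle, not the algorithm of [12] or [13]: the class BloomFilter, the helper
is_already_cached, the SHA-256 based hashing, and the default sizes are assumptions.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over content names (hypothetical helper)."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, name: str):
        # Derive k positions by salting the name with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{name}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, name: str) -> None:
        for pos in self._positions(name):
            self.bits[pos] = 1

    def might_contain(self, name: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(name))


def is_already_cached(name: str, bloom: BloomFilter, cached_names: list) -> bool:
    """Ask the Bloom filter first; confirm a positive answer by sequential search."""
    if not bloom.might_contain(name):
        return False              # a Bloom filter gives no false negatives
    for cached in cached_names:   # sequential search removes false positives
        if cached == name:
            return True
    return False
```

The sequential search only runs when the filter answers positively, so the common case of new content
stays cheap while a false positive can no longer cause a packet to be wrongly treated as redundant.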

Paper [29] proposes a mechanism that combines packet insertion and packet deletion by adding a
Caching Contribution parameter to the Interest packet. Each node decides whether or not it will cache the
data packet. If the data packet cannot be cached on a certain node, it is forwarded to another node. A
trail mechanism is built to store information about the path to the next node that can store the content.
In paper [10], nodes that often receive content requests from consumers have a high contribution value. A
node stores content that has a high contribution value if storage capacity is available. Paper [9]
proposes moving data toward the edge router closest to the consumer for every specific content request.
Cache placement techniques can be summarized into three classes, i.e. function-based, diversity, and
flooding, as shown in Figure 2. A comparison of the three techniques, including their technical focus,
advantages, and drawbacks, is given in Table 1.


Table 1. Comparison of Cache Placement Techniques 

Figure 2. Classification of cache placement techniques


3.4. Cache Content Selection

Cache content selection techniques focus on determining which content is cached and which content
should be removed from the cache. Among the selection techniques for insertion are Caching Everything
Everywhere (CEE) [3], [23], in which each node stores all data from the producer, meaning there is no
content selection, and Prob(p) [3], [5], [23], in which data is cached with probability p and not cached
with probability 1-p. As a result, the data packets cached by one router may differ from those cached by
other routers. Paper [30] proposes that every router cache the data with a probability determined by the
number of hops between the producer and the router. Paper [31] proposes selecting the content to be cached
based on a prediction that the content will be requested by local consumers. In relation to cache content
selection, content-centric network performance is also affected by CS replacement rules and user
localization [32]. The cache content selection techniques are summarized in Table 2.
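
The Prob(p) rule can be illustrated with a minimal sketch. The function names should_cache and simulate
and the seeded random generator are assumptions for illustration. Setting p = 1 reproduces CEE, since
every packet is then cached at every router.

```python
import random


def should_cache(p: float, rng: random.Random) -> bool:
    """Prob(p): cache with probability p, skip with probability 1 - p."""
    return rng.random() < p


def simulate(names: list, routers: list, p: float, seed: int = 0) -> dict:
    """Return, per router, the set of names it decided to cache (hypothetical helper)."""
    rng = random.Random(seed)
    return {router: {n for n in names if should_cache(p, rng)} for router in routers}
```

Because each router draws independently, two routers on the same path generally end up holding different
subsets of the content, which is the effect described in the text.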


Table 2. Comparison of Cache Content Selection Techniques (Insertion and Eviction)

Classification | Main focus | Advantages | Drawbacks
Popularity [29], [8], [36] | Focus on packet selection based on the number of requests for the packet. | Selection of content is based on consumer interest. | Less popular content can be omitted while it is still needed or requested by some consumers.
Probability [3], [5], [23], [37] | Emphasize the selection of packets with a certain probability. A packet can be cached with a certain probability. | More fair in determining the packet to be cached or deleted. | Some nodes may store the same content. Need specific strategies to determine the probability.
Prediction-based [34], [31] | Tighten the selection of packets based on predictions of whether the selection of content will provide the target value set. | Avoids storing unnecessary content. Accommodates the future needs of the user. | The prediction may be incorrect if the condition of the network or user changes. Internal calculation of the router is more complex.


Another technique related to cache content selection is prediction-based caching [33]. The router
decides whether content is cached based on the number of requests. This scheme adds a new table to the
router, named the Pending Species Interest Table (PSIT), which stores the list of the most requested
content based on data in the PIT. For example, some content may be requested regularly by consumers every
Monday, whereas other content is non-regular, such as content about a World Cup event. The Dynamic Cache
Adjustment algorithm is then used to decide whether or not a packet is cached based on its wastage value.
The content is first examined in terms of its size. If the CS still has sufficient space, the packet is
stored. If the CS is full, a packet in the CS is selected randomly and compared with the new data packet.
If the two are the same, the value of the hit parameter is increased. A further test compares the hit
parameter with the amount of data that has been sorted. If the hit value is higher, the packet is
allocated space in the buffer; otherwise, the content is not allocated space in the CS. The selection of
content can also be calculated from the local popularity and the hop count reduction gain that the packet
can provide [29].

Another content selection technique is Max-Gain In-network Caching (MAGIC) [34]. The method aims to
reduce bandwidth consumption and considers content popularity as well as hop reduction. On receiving an
Interest packet, each router calculates its Local Gain and compares it with the value stored in the
MaxGain field. If the router's Local Gain is greater than the MaxGain value, the router updates the
MaxGain value in the Interest packet. This MaxGain value is copied into an additional field in the Data
packet. Along the delivery path of the Data packet, a router whose Local Gain equals the MaxGain value in
the Data packet caches the packet.
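
The MaxGain exchange can be sketched under simplifying assumptions. The gain formula below (popularity
multiplied by hops saved, with no cost terms) is a placeholder and is not the formula of [34]. The function
names are hypothetical, and the sketch assumes that only the first router matching MaxGain on the return
path caches the packet.

```python
def local_gain(popularity: float, hops_saved: int) -> float:
    """Placeholder gain: popularity times hop reduction (assumed, not from [34])."""
    return popularity * hops_saved


def magic_cache_location(local_gains: list):
    """local_gains lists each router's Local Gain in the order the Interest visits them.

    Returns the index of the router that caches the returning Data packet,
    or None when no router reports a positive gain.
    """
    max_gain = 0.0
    for gain in local_gains:          # Interest path: raise the MaxGain field
        if gain > max_gain:
            max_gain = gain
    if max_gain <= 0:
        return None
    # Data path runs in reverse order; the first router whose gain matches caches.
    for index in reversed(range(len(local_gains))):
        if local_gains[index] == max_gain:
            return index
    return None
```

For gains of 1.0, 4.0, and 2.5 along the Interest path, the second router is the one that caches, so a
single copy is placed where the estimated benefit is largest.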

If a data packet enters a router node that does not have it in its Content Storage, the node checks
the state of its Content Storage. If the storage is full, the node selects a packet to delete in order to
provide space for the new packet. The techniques commonly used in NDN systems to select which packets are
deleted from the CS are Least Recently Used (LRU) and Least Frequently Used (LFU) [3], [29], [22].
Dehghan et al. [2] proposed another technique that assigns a timer to a packet. The timer determines how
long a packet may remain in the content storage before it is finally deleted. Paper [35] proposed the
Recent Usage Frequency (RFU) algorithm, which determines the popularity of content within a limited time
range. The content with the lowest popularity value is removed from the content store.
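
The two common eviction rules can be illustrated with a minimal sketch. The class LruContentStore and the
helper lfu_victim are hypothetical names; the LRU store relies on the standard collections.OrderedDict.

```python
from collections import OrderedDict


class LruContentStore:
    """Content store that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first

    def get(self, name: str):
        if name not in self.items:
            return None
        self.items.move_to_end(name)          # mark as most recently used
        return self.items[name]

    def put(self, name: str, data) -> None:
        if name in self.items:
            self.items.move_to_end(name)
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)    # evict the least recently used entry
        self.items[name] = data


def lfu_victim(hit_counts: dict) -> str:
    """LFU: pick the name with the fewest recorded hits."""
    return min(hit_counts, key=hit_counts.get)
```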

According to paper [24], caching performance can be improved by using efficient cache replacement
methods. In mobile networks this is a challenge, because the environment differs from fixed network
conditions. The parameters used by the replacement rule include recency, popularity, message size, cost
to obtain objects, and access delay [24]. The cache content selection techniques are summarized in
Figure 3 and compared in Table 2.


[Figure 3 diagram: Cache Content Selection (Insertion and Eviction) branches into function-based techniques (popularity, probability) and prediction-based techniques.]

Figure 3. Classification of cache content selection techniques


3.5. Cache Policy Design 

Cache policy design focuses on how content is stored in nodes. One cache policy technique is
utility-driven caching [8], in which a utility value is associated with each content item. The utility is
a function of the hit probability of the content. The goal is to maximize the total utility of the content
in content storage.

Paper [38] models the cache in its system as two layers. The first layer is the individual cache in
each node, and the second layer is the accumulation of all caches in the network. The study analyzes how
much content storage should be provided in the system to meet the performance needs of four applications,
i.e. web traffic, file sharing, and video traffic, the last of which is divided into user-generated
content (UGC) and video on demand (VoD).

Assantachai et al. [14] proposed a hybrid caching scheme. If new content requested by a consumer does
not yet exist on the router node, the new content is saved. The content replacement scheme combines a
cooperative approach and a distributive approach. Cooperative caching is a scheme in which each node makes
a replacement decision based on knowledge received from other nodes residing in the same region.
Distributive caching makes decisions independently using internal knowledge to achieve local maximum
performance. In paper [14] the network is divided into two parts: the normal region (the region at the
edge) and the backbone region (the region that connects the normal regions). In the normal region, on a
cache hit the content is moved to the front of the sequence, and on a cache miss the data at the tail of
the sequence is removed. The backbone region follows the normal region pattern, except that backbone nodes
work with other nodes in the same region to make caching decisions. A cooperative caching policy design is
also used in [39], with areas divided into clusters.

Papers [15] and [40] describe how the mechanism used to cache content has a crucial impact on the
efficiency of content delivery and the utilization of the CS. Paper [9] proposes dividing files into
smaller packets called chunks. The number of chunks disseminated depends on the popularity of the content
and is determined by the Chunk Marking Window (CMW), which is enlarged exponentially as chunks are
successfully delivered.

In [41], a content-centric network is implemented with two types of applications. A separate list is
created for each application, and each is identified by a unique ID. The CS is partitioned, and the
content of each application can only be stored in its own content store. The content storage partitioning
mechanism is tested with two methods: static cache partitioning and dynamic cache partitioning. In static
partitioning, the cache can only be used as specified, whereas in dynamic partitioning, cache space left
unused by one application can be shared with other applications. A cache splitting technique is also
proposed in [42]: the content storage is divided into two parts, one for popular content and the other for
less popular content. Paper [43] splits the content storage into three regions, with data categorized as
self data, friends' data, and strangers' data. Paper [16] focuses more specifically on cache management in
memory, where multiprocessors are used with certain interconnect mechanisms to reduce power usage.

A caching technique that couples cache placement, replacement, and location was proposed by
Xiaoyan Hu et al. [29]. To decide which packets are cached, a caching value is defined for each item o
that can be cached at node v. This caching value is the product of the local popularity and the hop count
reduction gain of the item, divided by the cache space contention, which has the same value in all
routers. If an Interest arrives at node v and the item is not yet cached there, node v calculates the
approximate potential caching contribution of the item. The data is cached at node v if the maximal value
of the caching contribution is positive. If the content storage is full, the item with the least caching
contribution value is selected for deletion. For cache location, the cache location component maintains a
trail that leads to the content. This trail is only created if the content is not cached on the local
node. The cache policy design techniques are summarized in Figure 4 and compared in Table 3.
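
The caching value of [29] described above can be sketched in a few lines. The formula follows the text
(local popularity times hop count reduction gain, divided by cache space contention). The admission test
in admit is an assumption of this sketch: the text does not define the caching contribution precisely, so
it is taken here as the new item's value minus the value of the least valuable cached item.

```python
def caching_value(popularity: float, hop_gain: float, contention: float) -> float:
    """Caching value as stated in the text: popularity * hop gain / contention."""
    return popularity * hop_gain / contention


def admit(store: dict, capacity: int, name: str, value: float) -> bool:
    """store maps name -> caching value; capacity is assumed to be at least 1.

    Returns True if `name` ends up cached (hypothetical admission rule).
    """
    if len(store) < capacity:
        if value > 0:
            store[name] = value
            return True
        return False
    victim = min(store, key=store.get)      # item with the least caching value
    if value - store[victim] > 0:           # assumed form of the contribution test
        del store[victim]
        store[name] = value
        return True
    return False
```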


[Figure 4 diagram: Cache Policy Design branches into cooperative, independent, and hybrid techniques.]

Figure 4. Classification of cache policy design


Table 3. Comparison of cache policy design techniques

Classification | Main focus | Advantages | Drawbacks
Cooperative | ... | ... | Need additional mechanisms to be ...
Independent [8], [31], [42], [38], [45], [46] | Caching-related decisions are performed by the node regardless of information from other nodes. | There is no need for additional mechanisms for monitoring and sharing information with other nodes. | Cannot do resource sharing.
Hybrid [14] | Merging between cooperative and independent techniques. | Can more efficiently apply certain mechanisms to specific conditions. | Adds to the computation process.

3.6. Caching for the Mobile Node

Generally, caching techniques for mobile nodes are based on subscribing a user to a content producer
[28], [47], prefetching content to the router that will next serve the consumer [19], [20], coordinating
the data transmission modes in VANETs [44], and supporting mobile nodes with energy in mind [48]. In a
mobile environment, the problem is that NDN nodes, including routers, producers, and consumers, are always
moving [21]. Producer movement causes a greater problem than the movement of consumer or router nodes. A
solution to the problem of producer movement is presented in paper [28].

The publish/subscribe system is a mechanism by which a subscriber receives messages from a publisher.
The relationship is governed by a manager, so a user who subscribes to certain content always receives
that content when the publisher generates it [47]. In earlier pub/sub systems, the producer does not store
messages that have already been published. New subscribers who join the system therefore cannot obtain
content that was published before they joined. To solve this problem, [28] proposed a storage mechanism
and a replication algorithm with differentiated content classes. In this new system, storage points can
change the content classes they store. The proposed replication algorithm selects M storage points from
the N points available in the network based on locality and popularity, the target replication degree of
each topic, and storage capacity.

One technique for accommodating consumer mobility in wireless networks is Proactive Multilevel
Cache Selection (PMCS), proposed in paper [18]. In this scheme, a consumer that is about to switch
coverage (hand off) sends a notification stating which router it will move to. The router currently in
use selects a subset of neighboring routers to receive the content that the consumer has requested but has
not yet received. When the handoff occurs, the consumer stops sending requests for data. During the
handoff, the destination router caches, up to a certain limit, the data packets from the old router that
the consumer has not yet received. Once the connection to the new router has been established, data
transmission is served by the new router. Another technique, proposed in [19], predicts node mobility and
selects the best prefetching node.

Paper [20] explains mechanisms that support producer mobility, such as pushing data, making copies
of data, determining content placement, and re-announcing content when the producer moves to another
area. Paper [44] proposes switching the VANET communication mode between vehicle-to-vehicle (V2V) and
vehicle-to-infrastructure (V2I), depending on the popularity of the downloaded content. A mobile node has
limited power, so the caching process has to take the node's energy consumption into account, in line
with green NDN as explained in [49]. Paper [48] proposed an energy-efficient technique for MANETs in which
the network is divided into groups, each managed by a Master Node. Paper [50] proposed a technique that
optimally selects the cluster head in a Wireless Sensor Network to improve efficiency.


4. CHALLENGE AND OPEN ISSUES 
4.1. QoS-based Caching 

Most of the caching techniques developed so far, whether for cache placement, cache content
selection, or cache policy design, have not considered different treatment for different services. In
existing studies, data is usually differentiated only by content popularity, content recency, the
estimated benefit of storing the content, and so on. Only a few studies take into account differentiated
treatment based on service requirements or user requirements, even though different users may subscribe
to different privileged services. So far, little research has been done on QoS-based caching in NDN.
Paper [45] is one of the papers that discusses this distinction using classes.

The concept follows the Differentiated Services (DiffServ) concept previously used in IP networks.
Further development is needed for caching mechanisms that can meet the different requirements of services
and users. These techniques include how to choose content and where to cache it in the network. The
decision can be made independently or cooperatively with other nodes in the network.


4.2. Caching for Mobile Node 

Node mobility must be supported to make the system flexible. Generally, mobility is divided into
producer mobility and consumer mobility; router mobility is similar to consumer mobility. Consumer
mobility is naturally supported by NDN, but producer mobility is not. Techniques for supporting producer
mobility are therefore one of the research opportunities. Several techniques related to caching in mobile
nodes have been presented to support mobility. For example, paper [18] proposes pre-fetching content when
the mobile node moves to the coverage of a new router. Another proposed method is to prefetch a group of
content items that the consumer usually requests, rather than a single item [31]. Pre-fetching requires
additional time to move content to the new router. Further investigation of other techniques for node
mobility support in NDN is required to ensure uninterrupted data communication even when the user switches
coverage, taking into account the expected delay, the cache load, and the complexity of the algorithm that
must be executed.


4.3. Energy-aware Caching 

NDN routers in a mobile wireless network have power restrictions. Caching techniques that consider
the power available at the node also need to be explored further. This may include selecting the nodes on
which to place content based on position, distance, energy availability at the node, resource
availability, and other factors relevant to process efficiency. It also covers techniques that reduce the
number of replacements that occur: if content is removed from the cache too frequently, caching is not
efficient.


4.4. Type of Data on Content Store 

Currently, the content cached on an NDN router can be either a whole file or a smaller unit called a
chunk [9], [44]. Chunk-based systems make transmission more efficient because a chunk that is lost during
transmission or deleted from the CS only needs to be replaced with a new chunk, without replacing the
whole file. However, dividing the file into chunks means that user requests are generated per chunk. In a
chunk-based system, more Interest packets are therefore needed for a complete file than in a file-based
system. Caching procedures and mechanisms for these forms of data should be explored further.
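
The trade-off between the two forms of data can be made concrete with a small calculation. The sketch
assumes one Interest packet per chunk and a fixed chunk size; the function names and the example sizes are
illustrative only.

```python
def interests_needed(file_size_bytes: int, chunk_size_bytes: int) -> int:
    """Interest packets for a whole file, assuming one Interest per chunk."""
    return (file_size_bytes + chunk_size_bytes - 1) // chunk_size_bytes  # ceiling division


def bytes_to_refetch(file_size_bytes: int, chunk_size_bytes: int, lost_chunks: int) -> int:
    """Upper bound on the data re-sent after losing `lost_chunks` chunks."""
    return min(file_size_bytes, lost_chunks * chunk_size_bytes)
```

For a 10,000-byte file and 4,096-byte chunks, the chunk-based system issues three Interest packets instead
of one, but a single lost chunk costs at most 4,096 bytes to replace rather than the full 10,000 bytes.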


5. CONCLUSION

In this paper, we have explained the advantages of the NDN architecture compared to the traditional
IP network and the Content Distribution Network, and the advantages of caching in NDN compared to its
predecessor systems. The development of various caching techniques has been mapped out, and the
advantages and drawbacks of each group have been explained. Finally, research opportunities related to
caching in NDN that can be investigated in the future have been suggested, i.e. caching mechanisms that
account for different QoS requirements for data and users, caching that supports node mobility, and
caching that considers energy.